CPM: A Large-scale Generative Chinese Pre-trained Language Model

Zhang, Zhengyan; Han, Xu; Zhou, Hao; Ke, Pei; Gu, Yuxian; Ye, Deming; Qin, Yujia; Su, Yusheng; Ji, Haozhe; Guan, Jian; Qi, Fanchao; Wang, Xiaozhi; Zheng, Yanan; Zeng, Guoyang; Cao, Huanqi; Chen, Shengqi; Li, Daixuan; Sun, Zhenbo; Liu, Zhiyuan; Huang, Minlie; Han, Wentao; Tang, Jie; Li, Juanzi; Zhu, Xiaoyan; Sun, Maosong

arXiv:2012.00413 (cs)

[Submitted on 1 Dec 2020]

Abstract: Pre-trained Language Models (PLMs) have proven beneficial for various downstream NLP tasks. Recently, GPT-3, with 175 billion parameters and 570GB of training data, drew a lot of attention for its capacity for few-shot (even zero-shot) learning. However, applying GPT-3 to Chinese NLP tasks is still challenging, as the training corpus of GPT-3 is primarily English and its parameters are not publicly available. In this technical report, we release the Chinese Pre-trained Language Model (CPM), built with generative pre-training on large-scale Chinese training data. To the best of our knowledge, CPM, with 2.6 billion parameters and 100GB of Chinese training data, is the largest Chinese pre-trained language model, and it could facilitate several downstream Chinese NLP tasks, such as conversation, essay generation, cloze tests, and language understanding. Extensive experiments demonstrate that CPM achieves strong performance on many NLP tasks in few-shot (even zero-shot) settings. The code and parameters are available at this https URL.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2012.00413 [cs.CL]
	(or arXiv:2012.00413v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2012.00413

Submission history

From: Zhengyan Zhang
[v1] Tue, 1 Dec 2020 11:32:56 UTC (37 KB)
